Convert Perl utf16 to utf8 functions to macros #23554
base: blead
These functions are hereby removed in favor of calling the plain macros that already exist
Slightly wondering about the "why";
I guess what I'm really wondering about is why they were added as functions in the first place?
This wasn't feasible until 93f23f0. The commit message there is poorly worded. If the function did not have a thread context, it could be a macro. That commit allowed functions with a thread context to be macros as well. Basically, no one had bothered (or been able to figure out how) to deal with macros whose first parameter, the thread context, isn't always there. The problem is that you need a comma separating the formal parameters in a macro definition. In a function definition the parameter can be pTHX_ expanding to nothing, but you can't say
I've thought about this for a long time. There might be a way to do it if the header files are called recursively. Most header files have an #ifndef GUARD at the beginning to prevent recursive calling. I actually filed a ticket against K&R C to make the guard the default, but got no answer. Others told me it was probably because Ritchie used it for some trickery, maybe some trojan horse like he put in the compiler. I've never sat down and tried to figure out how this might work to some advantage. And I've never seen an example of it being used. The above commit enables this behavior for things listed in embed.fnc. Future commits I am envisioning will expand on this. It works by defining two cases: one if there are threads, and one without.
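To make the "two cases" idea concrete, here is a minimal sketch in C. It is not the code generated from embed.fnc; the names `Demo_len`, `demo_len`, `demo_len_impl`, `demo_ctx` and the `DEMO_THREADS` switch are all hypothetical stand-ins. It only shows why a function-like macro needs two separate definitions when its leading context parameter exists in one build and not in the other.

```c
#include <string.h>

/* Hypothetical helper that needs no thread context at all. */
static size_t demo_len_impl(const char *s)
{
    return strlen(s);
}

#ifdef DEMO_THREADS
/* Threaded build: the long name takes a context parameter first.
 * The macro accepts it and drops it, since the helper does not need it. */
#  define Demo_len(ctx, s)  demo_len_impl(s)
#  define demo_len(s)       Demo_len(demo_ctx, s)
#else
/* Unthreaded build: there is no context parameter, so the long name
 * must be defined with one formal parameter fewer. */
#  define Demo_len(s)       demo_len_impl(s)
#  define demo_len(s)       Demo_len(s)
#endif
```

With either setting of `DEMO_THREADS`, a caller that writes `demo_len("abc")` gets the same result. A single definition cannot cover both builds, because a macro's formal parameter list cannot contain a name that sometimes expands to nothing.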
This commit would break source compatibility for unmodified CPAN code, if the CPAN module authors decided to be risk takers and use libperl features not authorized by P5P.
These APIs are very poorly designed and outright obfuscated.
Time to open Ghidra to understand this JAPH GOLF contest.
Symbol's Perl_utf16_to_utf8_reversed prototype is very poor and should never have landed in the repo in the first place. The 5th arg is spaghetti/obfuscation. Line 3372 in 07457a3
3 callers+1 exp sym tab entry at -O1 with LTO. Since it's an extern sym, LTO is illegal to do on it. C is C, ABI is ABI.
7 arguments!!!!
128 bits of storage were used to encode 2 bits of information. Or in metric, 16 bytes of storage were used to encode 1 byte of information.
ISO C looks at these 3 C features. Now let's keep deciphering this unreadable character conversion algorithm.
Why isn't the retval documented? I see a
Actually a good practice. Don't use
Why do we have a free retval CPU register we aren't using? Again, poor prototype layout. &uv and &retlen are now declared with ISO C's SysV ABI .pdf guarantees Linux's C ABI has a U128 retval register on AMD64 CPUs. MS's i386 ABI has a U64 register retval guaranteed per ABI .pdf. That will hold and transport Perl's char* and STRLEN vars from MS's Win64 ABI .pdf is brain damaged, and retval is a U64 register on AMD64 arch. @bulk88 has convinced MSVC, Clang and GCC to give Perl C linker fn symbols compiled Win64 on AMD64. There is a secret loophole in the Win64 for AMD64 ABI document that will give me a U128 integer retval type in MS-flavored ISO C land on MSVC/GCC/Clang ;-) And if there is a
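The point about the unused return register can be illustrated with a small C sketch. This is an assumption-laden example, not anything from the Perl source: `demo_span` and `demo_copy_bytes` are hypothetical names, and the byte copy stands in for a real conversion. It shows a function returning both the end pointer and the length by value instead of writing them through out-parameters. Under the System V AMD64 calling convention a 16-byte struct of two integer-class members like this is normally returned in a register pair; other ABIs may return it through hidden memory instead.

```c
#include <stddef.h>

/* Hypothetical result type: end pointer plus byte count, returned by value. */
typedef struct {
    unsigned char *end;  /* one past the last byte written */
    size_t         len;  /* number of bytes written */
} demo_span;

/* Copies n bytes from src to dst and reports both results in the return
 * value, so the caller needs no &end or &len out-parameters. */
static demo_span demo_copy_bytes(const unsigned char *src, size_t n,
                                 unsigned char *dst)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i];

    demo_span result = { dst + n, n };
    return result;
}
```

A caller would write `demo_span r = demo_copy_bytes(src, n, dst);` and then read `r.end` and `r.len`, which keeps both values out of the argument list.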
looks like a branchless algo, good. I'm not happy about the unaligned
The last 2 lines make no sense. Either return the buffer length/array length as the retval, or return a ptr to the last byte, or a ptr to the last byte +1. Why is this function returning imperial units in a outgoing data Next stage of decrypting this code, ^^^^ isn't C code. |/|/|/|/|/|/ below is C code.
I have a trick using
I hope this is getting constant folded away by all 3 big C compilers. But we are in the function body of an
Now let's send this C code through a -O1/-O2 C compiler, and figure out what it does in real life. This is MSVC 2022 -O1 LTO on.
WHAT IS THAT 5 ARGUMENT function call doing in this hottest control flow path of this for() loop!!!!!!! WHY ARE THERE 2 0x000000 const literals being written to the C stack? on each iteration of this loop? This is a disaster.
!@#$%##$% March 2025 PR I didn't comment in time/didn't see in time, that degraded ithread perl on every platform
Demo that I as a CPAN author, can [unauthorized] use the symbols that are proposed to be deleted by this commit.
Defeated, gotta use the long name trick.
I win the game. There was no prize in this game. It's just a magic trick. I can always copy paste whatever I want from the repo anyways as a CPAN author, a trick another CPAN dev more experienced than me told me is his solution to all private internal APIs that P5P doesn't want to stick in For research/archive purposes, here is a full asm dump of what this function does at
I have to go to work, I'm not reverse engineering what symbol
is doing with its 7 C stack arguments in a super hot loop. Best solution for all the issues I described is just to delete all of these functions/lines of code, and copy paste something from another tried and tested and trusted BSD licensed FOSS software project with a large user base, and not think too much more into this API feature. Perl's attempt at reinventing the wheel made the wheel a triangle.
dictionary.bin is this ~120 kb file https://github.com/google/brotli/blob/master/c/common/dictionary.bin
A professionally written algorithm is 5.5x faster than Perl's C code. Everyone should start searching GH for BSD software written in C and find something to copy paste into perl.
there was a bug above in the benchmark, it was only looping 1.7 kb,
Someone should try whatever the typical